ATOM Documentation

50 E2E Test Scenarios - Bug Analysis and Fixes

**Date:** 2026-02-09

**Test Script:** scripts/test_50_scenarios.py

**Results:** 72% pass rate (36/50 tests)

---

Test Results Summary

| Category | Pass Rate | Status |
| --- | --- | --- |
| Auth | 5/5 (100.0%) | ✅ Perfect |
| Agent | 5/7 (71.4%) | ⚠️ Issues |
| Execution | 6/6 (100.0%) | ✅ Perfect |
| Graduation | 5/7 (71.4%) | ⚠️ Issues |
| Episodes | 4/5 (80.0%) | ⚠️ Minor issues |
| Admin | 2/5 (40.0%) | ❌ Major issues |
| Errors | 5/5 (100.0%) | ✅ Perfect |
| Edge Cases | 0/5 (0.0%) | ❌ All failed |
| Performance | 3/3 (100.0%) | ✅ Perfect |
| Integration | 1/2 (50.0%) | ⚠️ Issues |

---

Root Cause Analysis

1. HTTP 429 Errors - Quota Enforcement (Not Rate Limiting)

**Tests Affected:**

  • [Agent] Enforce quota limits - "Quota enforced at agent 7"
  • [Agent] Create agent with all capabilities - HTTP 429
  • [Edge Cases] Handle unicode/special chars - HTTP 429
  • [Edge Cases] Handle long agent names - HTTP 429
  • [Edge Cases] Handle invalid maturity level - HTTP 429

**Root Cause:**

The HTTP 429 errors are from QuotaManager.check_agent_quota() which raises HTTP 429 when the agent limit is reached, NOT from rate limiting.

```python
# backend-saas/core/quota_manager.py:99
raise HTTPException(
    status_code=429,
    detail=f"Agent limit reached ({agent_count}/{limit}). Please upgrade your plan."
)
```

**Issue:**

  • Solo plan allows 10 agents
  • Test creates 6 agents in basic agent tests
  • Previous test runs may have left agents in database
  • No cleanup between runs causes accumulation

**Quota Limits:**

  • free: 3 agents
  • solo: 10 agents (QuotaManager.QUOTAS["solo"]["max_agents"])
  • team: 25 agents
  • enterprise: 1000 agents

**Fix Required:**

  1. Add test cleanup to delete agents after each test run
  2. Or use unique tenant subdomain for each test run
  3. Or increase solo plan quota for testing
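Option 2 can be sketched as a small helper. This is illustrative only (the function name is hypothetical, not part of the current test script): each run provisions its tenant under a fresh subdomain, so agents left over from earlier runs never count against the current run's quota.

```python
import time
import uuid

def make_test_subdomain(prefix: str = "test") -> str:
    """Build a unique tenant subdomain per test run, e.g. test-1707500000-a1b2c3,
    so agent accumulation from previous runs cannot trigger quota 429s."""
    return f"{prefix}-{int(time.time())}-{uuid.uuid4().hex[:6]}"
```

A timestamp alone can collide when two runs start in the same second, hence the extra UUID suffix.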

---

2. Admin Creation Response Missing Role

**Test Affected:**

  • [Admin] Create workspace admin - Role: N/A

**Root Cause:**

The create-admin endpoint returns TestAuthResponse (without role field) instead of AdminAuthResponse.

```python
# backend-saas/api/routes/test_auth_routes.py:271
return TestAuthResponse(  # ❌ Missing role field
    user_id=str(user.id),
    tenant_id=str(tenant.id),
    test_token=test_token,
    email=user.email,
    name=user.name  # ❌ Should be user.first_name or user.email
)
```

**Fix Applied:**

Changed return to AdminAuthResponse with role field:

```python
return AdminAuthResponse(
    user_id=str(user.id),
    tenant_id=str(tenant.id),
    test_token=test_token,
    token_type="test",
    email=user.email,
    name=user.first_name or user.email,
    role=user.role  # ✅ Now includes role
)
```

**Status:** ✅ FIXED

---

3. Promotion/Demotion HTTP 500 Errors

**Tests Affected:**

  • [Graduation] Promote agent with auth - HTTP 500
  • [Graduation] Demote agent with auth - HTTP 500
  • [Admin] Promote with JWT auth - HTTP 500
  • [Admin] Demote with JWT auth - HTTP 500

**Root Cause:**

The test tries to promote an agent from one tenant using an admin user from a different tenant.

```python
# Test setup creates:
self.tenant_id = "team-plan-tenant"      # From setup_tenant()
self.admin_tenant_id = "admin-tenant"    # From setup_admin()
self.agent_id = "agent-in-team-tenant"

# The promotion test then calls:
requests.post(
    f"{BASE_URL}/api/graduation/agents/{self.agent_id}/promote",
    headers={
        "Authorization": f"Bearer {self.admin_token}",  # Admin from admin-tenant
        "X-Tenant-ID": self.admin_tenant_id,            # Different tenant!
        "X-User-ID": self.admin_user_id,
    },
)
```

**Backend Logic:**

```python
# backend-saas/api/routes/graduation_routes.py:369
tenant_id = await extract_tenant_id(request)  # Gets admin_tenant_id
user_id = await extract_user_id(request)

# Looks for the agent in admin_tenant_id, but the agent lives in
# team-plan-tenant, so the lookup fails and the endpoint returns 500
```

**Fix Required:**

  1. Create admin user in the same tenant as the agent
  2. Or create a test agent in the admin tenant for promotion tests
  3. Update test to use same tenant for both agent and admin
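Whichever fix is chosen, the invariant is the same: the X-Tenant-ID header on a promotion request must name the tenant that owns the agent. A minimal sketch of a header builder enforcing this (the helper name is hypothetical):

```python
def promotion_headers(admin_token: str, agent_tenant_id: str, admin_user_id: str) -> dict:
    """Build headers for a promotion/demotion request. The key fix:
    X-Tenant-ID is the tenant that owns the agent, not the admin's
    home tenant, so the backend's agent lookup succeeds."""
    return {
        "Authorization": f"Bearer {admin_token}",
        "X-Tenant-ID": agent_tenant_id,
        "X-User-ID": admin_user_id,
    }
```

With this shape, passing `self.tenant_id` (the agent's tenant) instead of `self.admin_tenant_id` avoids the cross-tenant lookup failure.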

---

4. Episode Feedback Test Flow Issue

**Test Affected:**

  • [Episodes] Submit episode feedback - "Failed to create episode"

**Root Cause:**

The test tries to manually create an episode before submitting feedback, but the episode creation endpoint requires a valid execution_id.

The broken test flow:

  1. Create an episode via POST /api/test/episodes/create (❌ this endpoint does not exist)
  2. Submit feedback to the created episode (never reached)

**Correct Flow:**

Episodes are automatically created during agent execution. The test should:

  1. Execute an agent skill (creates episode)
  2. Get the episode_id from execution response
  3. Submit feedback for that episode

**Fix Required:**

Update test to use real agent execution flow:

```python
# 1. Execute agent (creates episode)
exec_response = requests.post(
    f"{BASE_URL}/api/test/agents/{agent_id}/execute",
    json={"skill_name": "read", "params": {"query": "test"}},
    headers={...}
)
execution_id = exec_response.json()["execution_id"]

# 2. Get episode from execution
episode_response = requests.get(
    f"{BASE_URL}/api/graduation/agents/{agent_id}/episodes?limit=1",
    headers={...}
)
episode_id = episode_response.json()["episodes"][0]["id"]

# 3. Submit feedback
feedback_response = requests.post(
    f"{BASE_URL}/api/graduation/episodes/{episode_id}/feedback",
    json={"feedback_score": 0.8, "feedback_notes": "Great work!"},
    headers={...}
)
```

---

5. Edge Case Validation Errors

**Tests Affected:**

  • [Edge Cases] Handle zero episode count - HTTP 422
  • [Edge Cases] Handle concurrent creation - 0/3 successful

**Root Cause for Zero Episode Count:**

The readiness endpoint validates episode_count parameter:

```python
# backend-saas/api/routes/graduation_routes.py:45
class ExamRequest(BaseModel):
    episode_count: int = Field(default=30, ge=10, le=100)  # ❌ Requires >= 10
```

**Fix Required:**

Test should use episode_count=10 (minimum valid value) instead of 0.
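For a quick sanity check outside the API, the constraint can be mirrored in plain Python. This helper is illustrative only (not the backend's code); it reproduces the same ge=10/le=100 bounds that produce the HTTP 422:

```python
def validate_episode_count(value: int, lo: int = 10, hi: int = 100) -> int:
    """Plain-Python mirror of the backend's Field(ge=10, le=100) constraint.
    Values outside [lo, hi] raise, matching the 422 the API returns."""
    if not lo <= value <= hi:
        raise ValueError(f"episode_count must be between {lo} and {hi}, got {value}")
    return value
```

So episode_count=0 fails locally for the same reason it fails remotely, while 10 is the smallest value that passes.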

**Root Cause for Concurrent Creation:**

Concurrent agent creation requests may hit quota enforcement simultaneously before quota is updated.

**Fix Required:**

  1. Add delays between concurrent requests
  2. Or use sequential creation for reliability
  3. Or handle 429 responses and retry after delay
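Option 3 can be sketched as a small retry wrapper. This is a generic pattern, not code from the test script; create_fn stands in for whatever callable issues the agent-creation request:

```python
import time

def create_with_retry(create_fn, max_attempts: int = 3, delay: float = 0.5):
    """Call create_fn() and retry with linear backoff whenever the response
    is a 429. create_fn is any callable returning an object that exposes a
    .status_code attribute (e.g. a requests.Response)."""
    response = create_fn()
    for attempt in range(1, max_attempts):
        if response.status_code != 429:
            break
        time.sleep(delay * attempt)  # back off before retrying
        response = create_fn()
    return response
```

Note that retries only help with transient contention; if the tenant is genuinely at its agent quota, every retry will keep returning 429.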

---

Rate Limiting vs Quota Enforcement

Common Misconception

**HTTP 429** errors in this test are NOT from rate limiting:

| Feature | Rate Limiting | Quota Enforcement |
| --- | --- | --- |
| Source | AbuseProtectionService.checkRateLimit() | QuotaManager.check_agent_quota() |
| Storage | Redis (sliding window) | PostgreSQL (persistent count) |
| Bypass | X-Test-Secret header | No bypass (hard limit) |
| Error Code | 429 (from middleware) | 429 (from quota check) |
| Limit | Requests per minute (60-6000) | Total agents (3-1000) |

Verification

The test endpoints are exempt from rate limiting:

```python
# backend-saas/core/security/__init__.py:30
if any(path.startswith(prefix) for prefix in self.exempted_prefixes) or test_secret:
    return await call_next(request)  # Bypass rate limiting

# Exempted prefixes include the test routes:
self.exempted_prefixes = [
    "/api/test",  # ✅ Test endpoints exempt
    ...
]
```

But quota enforcement still applies:

```python
# backend-saas/api/routes/test_auth_routes.py:381
QuotaManager.check_agent_quota(tenant_id, db)  # ❌ No bypass
```

---

Priority 1: Test Agent Accumulation

  1. Add cleanup function to delete all test agents after test run
  2. Use unique tenant subdomain per run (e.g., test-{timestamp})
  3. Add agent count logging to debug quota issues

Priority 2: Promotion Test Cross-Tenant Issue

  1. Update test to create admin user in same tenant as agent
  2. Or create test agent in admin tenant for promotion tests
  3. Add tenant_id validation in promotion tests

Priority 3: Episode Feedback Test

  1. Update test to use real agent execution flow
  2. Get episode_id from execution response
  3. Submit feedback for real episode

Priority 4: Edge Case Tests

  1. Fix zero episode count to use minimum valid value (10)
  2. Add delays between concurrent requests
  3. Handle 429 responses with retry logic

---

Files Modified

Backend

  • backend-saas/api/routes/test_auth_routes.py - Fixed admin creation response to include role

Test Script (Pending)

  • scripts/test_50_scenarios.py - Needs updates for:
      • Admin/agent tenant alignment
      • Episode feedback flow
      • Edge case validation
      • Concurrent request handling

---

Next Steps

  1. ✅ **Fixed admin creation response** - Deploy with next deployment
  2. ⏳ **Update test script** - Fix promotion/feedback/edge case tests
  3. ⏳ **Add test cleanup** - Delete agents after each run
  4. ⏳ **Re-run tests** - Verify all fixes work correctly
  5. ⏳ **Document test patterns** - Create test development guidelines

---

Test Execution Command

```bash
python3 scripts/test_50_scenarios.py
```

**Expected Results After Fixes:**

  • Pass rate: 95%+ (up from 72%)
  • All admin tests: Working
  • All edge case tests: Working
  • Episode feedback: Working